Automatic Text Simplification via Synonym Replacement

نویسنده

  • Sture Hägglund
چکیده

In this study automatic lexical simplification via synonym replacement in Swedish was investigated using three different strategies for choosing alternative synonyms: based on word frequency, based on word length, and based on level of synonymy. These strategies were evaluated in terms of standardized readability metrics for Swedish, average word length, proportion of long words, and in relation to the ratio of errors (type A) and number of replacements. The effect of replacements on different genres of texts was also examined. The results show that replacement based on word frequency and word length can improve readability in terms of established metrics for Swedish texts for all genres but that the risk of introducing errors is high. Attempts were made at identifying criteria thresholds that would decrease the ratio of errors but no general thresholds could be identified. In a final experiment word frequency and level of synonymy were combined using predefined thresholds. When more than one word passed the thresholds word frequency or level of synonymy was prioritized. The strategy was significantly better than word frequency alone when looking at all texts and prioritizing level of synonymy. Both prioritizing frequency and level of synonymy were significantly better for the newspaper texts. The results indicate that synonym replacement on a one-to-one word level is very likely to produce errors. Automatic lexical simplification should therefore not be regarded a trivial task, which is too often the case in research literature. In order to evaluate the true quality of the texts it would be valuable to take into account the specific reader. A simplified text that contains some errors but which fails to appreciate subtle differences in terminology can still be very useful if the original text is too difficult to comprehend to the unassisted reader.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Automatic Text Simplification via Synonym Replacement

Automatic lexical simplification via synonym replacement in Swedish was investigated. Three different methods for choosing alternative synonyms were evaluated: (1) based on word frequency, (2) based on word length, and (3) based on level of synonymy. These three strategies were evaluated in terms of standardized readability metrics for Swedish, average word length, and proportion of long words,...

متن کامل

Medical text simplification using synonym replacement: Adapting assessment of word difficulty to a compounding language

Medical texts can be difficult to understand for laymen, due to a frequent occurrence of specialised medical terms. Replacing these difficult terms with easier synonyms can, however, lead to improved readability. In this study, we have adapted a method for assessing difficulty of words to make it more suitable to medical Swedish. The difficulty of a word was assessed not only by measuring the f...

متن کامل

CASSA: A Context-Aware Synonym Simplification Algorithm

We present a new context-aware method for lexical simplification that uses two free language resources and real web frequencies. We compare it with the state-of-the-art method for lexical simplification in Spanish and the established simplification baseline, that is, the most frequent synonym. Our method improves upon the other methods in the detection of complex words, in meaning preservation,...

متن کامل

A Tool for Automatic Simplification of Swedish Texts

We present a rule based automatic text simplification tool for Swedish. The tool is designed to facilitate experimentation with various simplification techniques. The architecture of the tool is inspired by and partly built on a previous text simplification tool for Swedish, CogFLUX. New functionality, new operation types, and new simplification operations were added.

متن کامل

A Hybrid System for Spanish Text Simplification

This paper addresses the problem of automatic text simplification. Automatic text simplifications aims at reducing the reading difficulty for people with cognitive disability, among other target groups. We describe an automatic text simplification system for Spanish which combines a rule based core module with a statistical support module that controls the application of rules in the wrong cont...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012